Vectorize `@inbounds for x in A ...`
#13866
This would previously have been an infinite loop if `length(A) == typemax(Int)` so the loop vectorizer couldn't compute a trip count.
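To make the `typemax` case concrete, here is a minimal sketch of the arithmetic involved. The helper names `done_gt` and `done_eq` are hypothetical and are not the definitions in Base; the equality form is an assumption used only for illustration.

```julia
# Hypothetical exit tests, written as plain functions of the index `i`
# and the length `n`. They are illustrations, not the code in Base.
done_gt(i, n) = i > n        # inequality test, the style used before this PR
done_eq(i, n) = i == n + 1   # equality test (assumed form)

n = typemax(Int)

# No Int is greater than typemax(Int), so the `>` test can never fire.
println(done_gt(typemax(Int), n))

# Incrementing past typemax(Int) wraps around rather than exceeding n.
println(typemax(Int) + 1 == typemin(Int))

# The equality test still fires after the wraparound, so the number of
# iterations is well defined even in this extreme case.
println(done_eq(typemin(Int), n))
```

With the `>` test the compiler has to allow for a loop that never exits, which is presumably why it could not derive a trip count.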
Actually, maybe we should we start

```asm
L31:	cmpq	%r8, %rcx
	jae	L82
	movq	(%rdi), %rdx
Source line: 4
Source line: [inline] float.jl:269
	subq	%rsi, %rdx
	vcmpneqsd	(%rdx), %xmm0, %xmm1
	vmovd	%xmm1, %edx
	andl	$1, %edx
	addq	$-8, %rsi
Source line: 3
	incq	%rcx
Source line: 4
Source line: [inline] float.jl:269
	addq	%rdx, %rax
	cmpq	%rcx, %r8
	jne	L31
```

vs.

```asm
L37:	cmpq	%r8, %rcx
	jae	L95
Source line: 4
Source line: [inline] float.jl:269
	leaq	(,%rsi,8), %r10
Source line: 3
	movq	(%rdi), %rdx
Source line: 4
Source line: [inline] float.jl:269
	subq	%r10, %rdx
	vcmpneqsd	(%rdx), %xmm0, %xmm1
	vmovd	%xmm1, %edx
	andl	$1, %edx
	incq	%rcx
	decq	%rsi
	addq	%rdx, %rax
	cmpq	%rsi, %r9
	jne	L37
```

OTOH, LLVM 3.6 appears to be smarter and this doesn't make a difference there, so maybe this isn't worth it?
Possibly related to #9182.
Vectorize `@inbounds for x in A ...`
This deserves a comment explaining why the function isn't written in the most natural way, especially since it isn't covered by the tests, which means anybody might break it by rewriting it in an apparently better form.
More robust iteration over Vectors

Currently, if a vector is resized in the midst of iteration, then `done` might "miss" the end of iteration. This trivially changes the definition to catch such a case. I am not sure what guarantees we make about mutating iterables during iteration, but this seems simple and easy to support.

Note, though, that it is somewhat tricky: until #13866 we used `i > length(a)`, but that foils vectorization due to the `typemax` case. This definition seems to get the best of both worlds. For a definition like `f` below, this new definition just requires one extra `add i64` operation in the preamble (before the loop). Everything else is identical to master.

```julia
function f(A)
    r = 0
    @inbounds for x in A
        r += x
    end
    r
end
```
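The resizing problem described in that commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the definition in Base: `done_eq`, `done_ge`, and `count_steps` are hypothetical names, and the loop only walks indices without reading elements.

```julia
# Hypothetical exit tests; neither is claimed to be the definition in Base.
done_eq(i, n) = i == n + 1   # equality-only test
done_ge(i, n) = i >= n + 1   # inequality test that tolerates a shrinking length

# Walk the indices of `a`, shrinking it to one element when the index
# reaches 2. `cap` bounds the number of steps so that a missed end
# cannot run forever.
function count_steps(done, a; cap = 10)
    i, steps = 1, 0
    while !done(i, length(a)) && steps < cap
        steps += 1
        i == 2 && resize!(a, 1)   # mutate the vector mid-iteration
        i += 1
    end
    return steps
end

println(count_steps(done_eq, [1, 2, 3]))
println(count_steps(done_ge, [1, 2, 3]))
```

In this sketch the equality-only test skips past the new end and only stops at the cap, while the inequality test stops after two steps, right after the vector shrinks.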
This would previously have been an infinite loop if `length(A) == typemax(Int)`, so the loop vectorizer couldn't compute a trip count. Ref #13860 (comment)